其他
【大数据哔哔集20210113】Hive的动态分区和静态分区
Hive中支持两种模式的分区:
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;
-- Hive默认配置值
-- 开启或关闭动态分区
hive.exec.dynamic.partition=false;
-- 设置为nonstrict模式,让所有分区都动态配置,否则至少需要指定一个分区值
hive.exec.dynamic.partition.mode=strict;
-- 能被mapper或reducer创建的最大动态分区数,超出而报错
hive.exec.max.dynamic.partitions.pernode=100;
-- 一条带有动态分区SQL语句所能创建的最大动态分区总数,超过则报错
hive.exec.max.dynamic.partitions=1000;
-- 全局能被创建文件数目的最大值,通过Hadoop计数器跟踪,若超过则报错
hive.exec.max.created.files=100000;
-- 根据个人需要配置
set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;
set hive.exec.max.dynamic.partitions.pernode=1000;
set hive.exec.max.dynamic.partitions=10000;
set hive.exec.max.created.files=1000000;
实战演示
CREATE TABLE student_data(id STRING, name STRING, year INT, major INT)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ',';
CREATE TABLE student_static_partition(id STRING, name STRING)
PARTITIONED BY (year INT, major INT)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ',';
INSERT INTO TABLE student_static_partition PARTITION(year=2001,major=810)
SELECT id,name FROM student_data WHERE year=2001 AND major=810;
INSERT INTO TABLE student_static_partition PARTITION(year=2001,major=820)
SELECT id,name FROM student_data WHERE year=2001 AND major=820;
INSERT INTO TABLE student_static_partition PARTITION(year=2002,major=810)
SELECT id,name FROM student_data WHERE year=2002 AND major=810;
INSERT INTO TABLE student_static_partition PARTITION(year=2002,major=820)
SELECT id,name FROM student_data WHERE year=2002 AND major=820;
SHOW PARTITIONS student_static_partition;
+----------------------+--+
| partition |
+----------------------+--+
| year=2001/major=810 |
| year=2001/major=820 |
| year=2002/major=810 |
| year=2002/major=820 |
+----------------------+--+
CREATE TABLE student_dynamic_partition(id STRING, name STRING)
PARTITIONED BY (year INT, major INT)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ',';
set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;
INSERT OVERWRITE TABLE student_dynamic_partition PARTITION (year, major)
SELECT id,name,year,major
FROM student_data;
SHOW PARTITIONS student_dynamic_partition
+----------------------+--+
| partition |
+----------------------+--+
| year=2001/major=810 |
| year=2001/major=820 |
| year=2002/major=810 |
| year=2002/major=820 |
+----------------------+--+
-- Hive默认配置值
-- 开启或关闭动态分区
hive.exec.dynamic.partition=false;
-- 设置为nonstrict模式,让所有分区都动态配置,否则至少需要指定一个分区值
hive.exec.dynamic.partition.mode=strict;
-- 能被mapper或reducer创建的最大动态分区数,超出而报错
hive.exec.max.dynamic.partitions.pernode=100;
-- 一条带有动态分区SQL语句所能创建的最大动态分区总数,超过则报错
hive.exec.max.dynamic.partitions=1000;
-- 全局能被创建文件数目的最大值,通过Hadoop计数器跟踪,若超过则报错
hive.exec.max.created.files=100000;
-- 根据个人需要配置
set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;
set hive.exec.max.dynamic.partitions.pernode=1000;
set hive.exec.max.dynamic.partitions=10000;
set hive.exec.max.created.files=1000000;
【大数据哔哔集20210112】Sorry,Hbase的LSM Tree真的可以为所欲为!
【大数据哔哔集20210111】HDFS中的常用压缩算法及区别
【大数据哔哔集20210110】后起之秀ClickHouse的优缺点和核心特性